Decision Tree Analysis on J48 and Random Forest Algorithm for Data Mining Using Breast Cancer Microarray Dataset

Authors

  • Ajay Kumar Mishra
  • Kumar Pani
  • Bikram Keshari Ratha
Abstract

Data mining involves the systematic analysis of large datasets to extract knowledge. Classification is considered one of its major basic research topics. Rapid developments in microarray technology offer the capability to measure the expression levels of thousands of genes simultaneously. Studying such data helps us discover different clinical outcomes that are caused by the expression of a few predictive genes. Decision tree models help in predicting new data. In this work, we compare two of the most popular decision tree algorithms, J48 and Random Forest, using a breast cancer microarray dataset that is available at the UCI Machine Learning Repository.
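
As background for the comparison described in the abstract: J48 is Weka's implementation of C4.5, which ranks candidate splits by gain ratio (information gain divided by the split's own entropy). The abstract does not describe the authors' setup, so the following is only a minimal sketch of that criterion for a single numeric feature. The function names and the six-sample toy data are hypothetical and are not taken from the paper or the UCI dataset, and the additional heuristics of real C4.5 (pruning, minimum-gain checks) are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one feature."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    total = len(labels)
    weights = [len(left) / total, len(right) / total]
    # Information gain: parent entropy minus weighted child entropy.
    info_gain = entropy(labels) - (
        weights[0] * entropy(left) + weights[1] * entropy(right)
    )
    # Split information: entropy of the partition sizes themselves.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


# Hypothetical toy data: expression level of one gene for six samples.
expression = [0.2, 0.4, 0.5, 1.1, 1.3, 1.6]
outcome = ["benign", "benign", "benign",
           "malignant", "malignant", "malignant"]

# Try each observed value (except the largest) as a candidate threshold.
best_threshold = max(
    expression[:-1],
    key=lambda t: gain_ratio(expression, outcome, t),
)
print(best_threshold)
```

A Random Forest, by contrast, trains many trees on bootstrap samples with a random subset of features considered at each split and combines their votes; that part is not sketched here.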

Similar resources

Study of Tree Base Data Mining Algorithms for Network Intrusion Detection

Internet usage has grown rapidly, and the number of network attacks has increased with it. This emphasizes the importance of network intrusion detection systems (IDS) for securing the network. Intrusion detection is the process of monitoring and analyzing network traffic to detect security violations. Many researchers have suggested data mining techniques such as classification, clustering, pattern matching and ru...

Gene Expression Data Analysis Using Data Mining Algorithms for Colon Cancer

The concept of data mining is used in various medical applications such as tumor classification, protein structure prediction, gene classification, cancer classification based on microarray data, clustering of gene expression data, statistical modeling of protein-protein interactions, etc. Adverse drug events in prediction of medical test effectiveness can be done based on genomics and proteomics throu...

An Automated Diagnosis Of Breast Cancer Using Farthest First Clustering And Decision Tree J48 Classifier

Breast cancer is one of the most widespread and deadly cancers for women. Early diagnosis and treatment of breast cancer can improve patient outcomes. Due to the difficulties of outliers and skewed data, the prediction of breast cancer survey has presented many challenges in the field of data mining and pattern recognition. To solve these problems, we have proposed an automated breast ...

A Perspective Analysis of Traffic Accident using Data Mining Techniques

Data mining is the extraction of hidden patterns from huge databases. It is commonly used in marketing, surveillance, fraud detection and scientific discovery. In data mining, machine learning is a main research focus: it automatically learns to recognize complex patterns and make intelligent decisions based on data. Nowadays traffic accidents are the major causes of death and injuries i...

ADABOOST ENSEMBLE ALGORITHMS FOR BREAST CANCER CLASSIFICATION

With advances in technology, different tumor features have been collected for Breast Cancer (BC) diagnosis. Processing large data sets poses challenges, including high storage requirements and the time required for access and processing. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...

Publication date: 2015